annealed importance sampling
Score-Based Diffusion meets Annealed Importance Sampling
More than twenty years after its introduction, Annealed Importance Sampling (AIS) remains one of the most effective methods for marginal likelihood estimation. It relies on a sequence of distributions interpolating between a tractable initial distribution and the target distribution of interest which we simulate from approximately using a non-homogeneous Markov chain. To obtain an importance sampling estimate of the marginal likelihood, AIS introduces an extended target distribution to reweight the Markov chain proposal. While much effort has been devoted to improving the proposal distribution used by AIS, by changing the intermediate distributions and corresponding Markov kernels, an underappreciated issue is that AIS uses a convenient but suboptimal extended target distribution. This can hinder its performance. We here leverage recent progress in score-based generative modeling (SGM) to approximate the optimal extended target distribution for AIS proposals corresponding to the discretization of Langevin and Hamiltonian dynamics. We demonstrate these novel, differentiable, AIS procedures on a number of synthetic benchmark distributions and variational auto-encoders.
Score-Based Diffusion meets Annealed Importance Sampling
More than twenty years after its introduction, Annealed Importance Sampling (AIS) remains one of the most effective methods for marginal likelihood estimation. It relies on a sequence of distributions interpolating between a tractable initial distribution and the target distribution of interest which we simulate from approximately using a non-homogeneous Markov chain. To obtain an importance sampling estimate of the marginal likelihood, AIS introduces an extended target distribution to reweight the Markov chain proposal. While much effort has been devoted to improving the proposal distribution used by AIS, by changing the intermediate distributions and corresponding Markov kernels, an underappreciated issue is that AIS uses a convenient but suboptimal extended target distribution. This can hinder its performance.
Complexity Analysis of Normalizing Constant Estimation: from Jarzynski Equality to Annealed Importance Sampling and beyond
Guo, Wei, Tao, Molei, Chen, Yongxin
Given an unnormalized probability density $\pi\propto\mathrm{e}^{-V}$, estimating its normalizing constant $Z=\int_{\mathbb{R}^d}\mathrm{e}^{-V(x)}\mathrm{d}x$ or free energy $F=-\log Z$ is a crucial problem in Bayesian statistics, statistical mechanics, and machine learning. It is challenging especially in high dimensions or when $\pi$ is multimodal. To mitigate the high variance of conventional importance sampling estimators, annealing-based methods such as Jarzynski equality and annealed importance sampling are commonly adopted, yet their quantitative complexity guarantees remain largely unexplored. We take a first step toward a non-asymptotic analysis of annealed importance sampling. In particular, we derive an oracle complexity of $\widetilde{O}\left(\frac{d\beta^2{\mathcal{A}}^2}{\varepsilon^4}\right)$ for estimating $Z$ within $\varepsilon$ relative error with high probability, where $\beta$ is the smoothness of $V$ and $\mathcal{A}$ denotes the action of a curve of probability measures interpolating $\pi$ and a tractable reference distribution. Our analysis, leveraging Girsanov theorem and optimal transport, does not explicitly require isoperimetric assumptions on the target distribution. Finally, to tackle the large action of the widely used geometric interpolation of probability distributions, we propose a new normalizing constant estimation algorithm based on reverse diffusion samplers and establish a framework for analyzing its complexity.
Score-Based Diffusion meets Annealed Importance Sampling
More than twenty years after its introduction, Annealed Importance Sampling (AIS) remains one of the most effective methods for marginal likelihood estimation. It relies on a sequence of distributions interpolating between a tractable initial distribution and the target distribution of interest which we simulate from approximately using a non-homogeneous Markov chain. To obtain an importance sampling estimate of the marginal likelihood, AIS introduces an extended target distribution to reweight the Markov chain proposal. While much effort has been devoted to improving the proposal distribution used by AIS, by changing the intermediate distributions and corresponding Markov kernels, an underappreciated issue is that AIS uses a convenient but suboptimal extended target distribution. This can hinder its performance.
Variational Learning of Gaussian Process Latent Variable Models through Stochastic Gradient Annealed Importance Sampling
Xu, Jian, Du, Shian, Yang, Junmei, Ma, Qianli, Zeng, Delu
Gaussian Process Latent Variable Models (GPLVMs) have become increasingly popular for unsupervised tasks such as dimensionality reduction and missing data recovery due to their flexibility and non-linear nature. An importance-weighted version of the Bayesian GPLVMs has been proposed to obtain a tighter variational bound. However, this version of the approach is primarily limited to analyzing simple data structures, as the generation of an effective proposal distribution can become quite challenging in high-dimensional spaces or with complex data sets. In this work, we propose an Annealed Importance Sampling (AIS) approach to address these issues. By transforming the posterior into a sequence of intermediate distributions using annealing, we combine the strengths of Sequential Monte Carlo samplers and VI to explore a wider range of posterior distributions and gradually approach the target distribution. We further propose an efficient algorithm by reparameterizing all variables in the evidence lower bound (ELBO). Experimental results on both toy and image datasets demonstrate that our method outperforms state-of-the-art methods in terms of tighter variational bounds, higher log-likelihoods, and more robust convergence.
- North America > United States (0.14)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.65)
Ensemble-Based Annealed Importance Sampling
Sampling from a multimodal distribution is a fundamental and challenging problem in computational science and statistics. Among various approaches proposed for this task, one popular method is Annealed Importance Sampling (AIS). In this paper, we propose an ensemble-based version of AIS by combining it with population-based Monte Carlo methods to improve its efficiency. By keeping track of an ensemble instead of a single particle along some continuation path between the starting distribution and the target distribution, we take advantage of the interaction within the ensemble to encourage the exploration of undiscovered modes. Specifically, our main idea is to utilize either the snooker algorithm or the genetic algorithm used in Evolutionary Monte Carlo. We discuss how the proposed algorithm can be implemented and derive a partial differential equation governing the evolution of the ensemble under the continuous time and mean-field limit. We also test the efficiency of the proposed algorithm on various continuous and discrete distributions.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > United States > California > Santa Clara County > Stanford (0.04)
- (2 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.49)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.47)
Surrogate Likelihoods for Variational Annealed Importance Sampling
Variational inference is a powerful paradigm for approximate Bayesian inference with a number of appealing properties, including support for model learning and data subsampling. By contrast MCMC methods like Hamiltonian Monte Carlo do not share these properties but remain attractive since, contrary to parametric methods, MCMC is asymptotically unbiased. For these reasons researchers have sought to combine the strengths of both classes of algorithms, with recent approaches coming closer to realizing this vision in practice. However, supporting data subsampling in these hybrid methods can be a challenge, a shortcoming that we address by introducing a surrogate likelihood that can be learned jointly with other variational parameters. We argue theoretically that the resulting algorithm permits the user to make an intuitive trade-off between inference fidelity and computational cost. In an extensive empirical comparison we show that our method performs well in practice and that it is well-suited for black-box inference in probabilistic programming frameworks.
- South America > Paraguay > Asunción > Asunción (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Annealed Importance Sampling with q-Paths
Brekelmans, Rob, Masrani, Vaden, Bui, Thang, Wood, Frank, Galstyan, Aram, Steeg, Greg Ver, Nielsen, Frank
Annealed importance sampling (AIS) is the gold standard for estimating partition functions or marginal likelihoods, corresponding to importance sampling over a path of distributions between a tractable base and an unnormalized target. While AIS yields an unbiased estimator for any path, existing literature has been primarily limited to the geometric mixture or moment-averaged paths associated with the exponential family and KL divergence. We explore AIS using $q$-paths, which include the geometric path as a special case and are related to the homogeneous power mean, deformed exponential family, and $\alpha$-divergence.
Efficient Evaluation of the Partition Function of RBMs with Annealed Importance Sampling
Mazzanti, Ferran, Romero, Enrique
Probabilistic models based on Restricted Boltzmann Machines (RBMs) imply the evaluation of normalized Boltzmann factors, which in turn require from the evaluation of the partition function Z. The exact evaluation of Z, though, becomes a forbiddingly expensive task as the system size increases. This even worsens when one considers most usual learning algorithms for RBMs, where the exact evaluation of the gradient of the log-likelihood of the empirical distribution of the data includes the computation of Z at each iteration. The Annealed Importance Sampling (AIS) method provides a tool to stochastically estimate the partition function of the system. So far, the standard use of the AIS algorithm in the Machine Learning context has been done using a large number of Monte Carlo steps. In this work we show that this may not be required if a proper starting probability distribution is employed as the initialization of the AIS algorithm. We analyze the performance of AIS in both small- and large-sized problems, and show that in both cases a good estimation of Z can be obtained with little computational cost.
- North America > Canada > Ontario > Toronto (0.14)
- Europe > Spain (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Annealed Importance Sampling for Structure Learning in Bayesian Networks
Niinimäki, Teppo Mikael (University of Helsinki) | Koivisto, Mikko (University of Helsinki)
We present a new sampling approach to Bayesian learning of the Bayesian network structure. Like some earlier sampling methods, we sample linear orders on nodes rather than directed acyclic graphs (DAGs). The key difference is that we replace the usual Markov chain Monte Carlo (MCMC) method by the method of annealed importance sampling (AIS). We show that AIS is not only competitive to MCMC in exploring the posterior, but also superior to MCMC in two ways: it enables easy and efficient parallelization, due to the independence of the samples, and lower-bounding of the marginal likelihood of the model with good probabilistic guarantees. We also provide a principled way to correct the bias due to order-based sampling, by implementing a fast algorithm for counting the linear extensions of a given partial order.